Robust Imputation of Missing Values in Compositional Data Using the R-Package robCompositions
Abstract
The aim of this contribution is to show how the R-package robCompositions can be applied to estimate missing values in compositional data. Two procedures are summarized, one of which remains highly stable even in the presence of outlying observations. Measures of information loss are presented, and their application is demonstrated. Moreover, we introduce new diagnostic tools that are useful for inspecting the quality of the imputed data.
Similar resources
Accuracy evaluation of different statistical and geostatistical censored data imputation approaches (Case study: Sari Gunay gold deposit)
Most geochemical datasets include missing data in varying proportions, which may cause significant problems in geostatistical modeling or multivariate analysis of the data. Therefore, it is common to impute the missing data in most geochemical studies. In this study, three approaches called half detection (HD), multiple imputation (MI), and the cosimulation based on Markov model 2...
Imputation of missing values for compositional data using classical and robust methods
New imputation algorithms for estimating missing values in compositional data are introduced. A first proposal uses the k-nearest neighbor procedure based on the Aitchison distance, a distance measure especially designed for compositional data. It is important to adjust the estimated missing values to the overall size of the compositional parts of the neighbors. As a second proposal an iterativ...
Influence of Pattern of Missing Data on Performance of Imputation Methods: An Example from National Data on Drug Injection in Prisons
Background: Policy makers need models to be able to detect groups at high risk of HIV infection. Incomplete records and dirty data are frequently seen in national data sets. The presence of missing data challenges the practice of model development. Several studies suggested that the performance of imputation methods is acceptable when the missing rate is moderate. One of the issues which was of less concern...
Several approaches to handling missing values of quantitative variables and their effect on the results of a clinical trial
Background and Objectives: A major challenge affecting longitudinal studies is the problem of missing data. Missing data may cause part of the information to be lost, which reduces the accuracy of the estimator and leaves the results biased and inaccurate. Therefore, it is necessary to evaluate the missing data mechanism in longitudinal research and to consider thi...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...